Course - 16825 : Learning for 3D Vision
Name - Parth Nilesh Shah
AndrewId - pnshah
Date - 2/25/2024
[Three figure panels, images not preserved: Ground Truth | Optimized]
[Three figure panels, images not preserved: Image | Ground Truth | Prediction]
| Type | Avg F1 Score @0.05 |
|---|---|
| Voxel | 44.534 |
| Pointcloud | 84.523 |
| Mesh | 71.633 |
Note: I did not use the updated code, hence the lower score for voxel prediction.
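For reference, the F1 score at a distance threshold can be sketched as below. This is a minimal sketch assuming the common point-based definition (precision is the share of predicted points within the threshold of some ground-truth point, recall is the reverse, and F1 is their harmonic mean, reported in percent). The function names `nearest_distance` and `f1_at_threshold` are hypothetical, and this is not the evaluation code used to produce the table.

```python
import math


def nearest_distance(point, cloud):
    """Distance from one 3D point to its nearest neighbour in `cloud`."""
    return min(math.dist(point, other) for other in cloud)


def f1_at_threshold(pred, gt, threshold=0.05):
    """F1 score (in percent) between two point sets at a distance threshold.

    Assumption: brute-force nearest neighbours, fine for small point sets only.
    """
    # Precision: share of predicted points that lie close to the ground truth.
    precision = sum(nearest_distance(p, gt) < threshold for p in pred) / len(pred)
    # Recall: share of ground-truth points that are covered by a prediction.
    recall = sum(nearest_distance(g, pred) < threshold for g in gt) / len(gt)
    if precision + recall == 0:
        return 0.0
    return 100.0 * 2 * precision * recall / (precision + recall)


# Toy example: one of two predicted points matches, one of two GT points is covered.
pred = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
gt = [(0.0, 0.0, 0.01), (5.0, 5.0, 5.0)]
print(f1_at_threshold(pred, gt))
```

For voxel and mesh outputs, a score like this would be computed on points sampled from the predicted and ground-truth shapes, which is an additional assumption here.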
[Evaluation figures, images not preserved: Voxel | Point Cloud | Mesh]
Intuitive Explanation
Experiment - Changing n_points in Pointcloud
The decoder architecture for my Image2Pointcloud model is as follows:
Linear(512, n_points) -> ReLU -> Linear(n_points, n_points * 3) -> Tanh
Since the width of the hidden layer depends on n_points, I decided to vary n_points and see how it affects the results.
| Image | GT | 500 | 2000 | 5000 |
|---|---|---|---|---|
| F1score@0.05 | - | 71.354 | 79.635 | 84.621 |
We can see that as the number of points increases, the performance of the model increases as well: the F1 score @0.05 rises from 71.354 at 500 points to 84.621 at 5000 points.
I interpret it as follows: as the number of points increases, the number of connections in the hidden layer also increases, making the network more expressive.
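To make the growth in connections concrete, the sketch below counts the weights and biases of the decoder layout described above for each n_points setting. It assumes both linear layers carry bias terms and that the latent size is 512. The helper name `decoder_param_count` is hypothetical.

```python
def decoder_param_count(n_points, latent_dim=512):
    """Weights and biases in Linear(latent_dim, n_points) -> Linear(n_points, 3 * n_points).

    Assumption: both layers include a bias vector.
    """
    hidden = latent_dim * n_points + n_points          # first linear layer
    output = n_points * (3 * n_points) + 3 * n_points  # second linear layer
    return hidden + output


for n in (500, 2000, 5000):
    print(n, decoder_param_count(n))
```

Under these assumptions the count is 3 * n_points^2 + 516 * n_points, so it grows roughly quadratically: about 1.0M parameters at 500 points, 13.0M at 2000, and 77.6M at 5000.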
I performed an experiment in which I trained a point cloud prediction network on the full dataset.
Qualitative Results -
| Image | Ground Truth | Prediction |
|---|---|---|
Quantitative Results -
avg F1 @ 0.05 = 91.45%
| Evaluation on 3 classes |
|---|
A couple of things I noticed: I had to train for roughly 20,000 iterations on the full dataset for the loss to converge, nearly three times the number of iterations it took when training on one class.
Qualitatively, the predictor seems to do better on the plane and car classes than on the chair class.